Skip to content

Conversation

@sbaldu
Copy link
Contributor

@sbaldu sbaldu commented Apr 7, 2025

This PR caches the device properties once they have been accessed so as to reduce overheads for multiple consecutive calls.

@sbaldu sbaldu marked this pull request as draft April 7, 2025 08:02
@sbaldu sbaldu marked this pull request as ready for review April 7, 2025 08:52
Comment on lines 104 to 105
std::shared_ptr<alpaka::DeviceProperties> m_deviceProperties;
std::shared_ptr<std::mutex> m_mutex;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you move these into the cpu::detail::DevCpuImpl class ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, done

@SimeonEhrig
Copy link
Member

@sbaldu The AMD CI runner has driver problems. I talk to the support. Don't care about the failed HIP runtime tests for the moment.

@sbaldu sbaldu force-pushed the cache-device-properties branch from 28ff342 to 68c8a6a Compare April 9, 2025 15:44
@sbaldu sbaldu force-pushed the cache-device-properties branch from 68c8a6a to e49e604 Compare April 9, 2025 15:48
@sbaldu
Copy link
Contributor Author

sbaldu commented Apr 9, 2025

@sbaldu The AMD CI runner has driver problems. I talk to the support. Don't care about the failed HIP runtime tests for the moment.

Ok thanks. I see that the rest of the tests are passing.

@sbaldu sbaldu requested a review from fwyzard April 9, 2025 17:13
@SimeonEhrig
Copy link
Member

@sbaldu The AMD CI runner has driver problems. I talk to the support. Don't care about the failed HIP runtime tests for the moment.

Ok thanks. I see that the rest of the tests are passing.

The problem still exist. I'm on it. It takes a little bit longer, because the administrator is in vacations and I'm working together with another administrator which overtakes it temporary.

@SimeonEhrig
Copy link
Member

@sbaldu Sorry, I did miss take. Your code has a bug. I can reproduce it locally.
It looks like a well know bug in the CI but the CI is working. I mixed up this PR with your other PR #2490 and therefore I thought your changes cannot break the tests. But the changes of the code of this PR looks like, that they can produce a dead look.

@sbaldu
Copy link
Contributor Author

sbaldu commented Apr 10, 2025

@sbaldu Sorry, I did miss take. Your code has a bug. I can reproduce it locally. It looks like a well know bug in the CI but the CI is working. I mixed up this PR with your other PR #2490 and therefore I thought your changes cannot break the tests. But the changes of the code of this PR looks like, that they can produce a dead look.

Ok thanks for the update. I'll fix it asap.

@sbaldu sbaldu force-pushed the cache-device-properties branch from 309f1bd to ef3a99a Compare April 11, 2025 08:32
@sbaldu sbaldu force-pushed the cache-device-properties branch from de53bef to c130965 Compare April 11, 2025 10:04
@fwyzard
Copy link
Contributor

fwyzard commented May 6, 2025

Now I'm looking at parts of alpaka I never read before :-D

I understand why you added the properties and flag to the QueueRegistry class - the CPU and CUDA/HIP devices have a shared_ptr to an instance of that class.

From a semantic (?) point of view, something called a QueueRegistry does not seem to be the right place for it :-/

@psychocoderHPC what do you think ?
I would rather avoid adding a second shared state, so reusing the QueueRegistry seems to make sense.
Maybe we could rename it to something more in line with the added features ?
What does this code look like in alpaka 3 ?

@fwyzard
Copy link
Contributor

fwyzard commented May 6, 2025

Instead, from a more technical point of view, I would prefer to move more of the common code to a reusable function, to avoid repetitions.

Could you try to move the setDeviceProperties directly in the class that owns the once_flag and the DeviceProperties, at least for the SYCL case ?

For the CPU and CUDA/HIP cases it seems more complicated.

@fwyzard fwyzard requested a review from psychocoderHPC May 14, 2025 05:00
@sbaldu sbaldu force-pushed the cache-device-properties branch from c87afe2 to b137daa Compare May 21, 2025 10:24
@sbaldu
Copy link
Contributor Author

sbaldu commented May 21, 2025

ok now all the tests seem to be passing.
@SimeonEhrig @psychocoderHPC I renamed the QueueRegistry as DevGenericImpl, which is similar to the naming used for SYCL, is that ok?

@AuroraPerego AuroraPerego dismissed psychocoderHPC’s stale review May 27, 2025 14:57

Changes approved in the weekly meeting

@AuroraPerego AuroraPerego merged commit 5bc1579 into alpaka-group:develop May 27, 2025
25 checks passed
@sbaldu sbaldu deleted the cache-device-properties branch November 27, 2025 18:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants